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The solution dynamics of antibodies are critical to antibody function. We explore the internal solution dynamics of 
antibody molecules through the combination of time-resolved fluorescence anisotropy experiments on IgGI with 
more than two microseconds of all-atom molecular dynamics (MD) simulations in explicit water, an order of magnitude 
more than in previous simulations. We analyze the correlated motions with a mutual information entropy quantity, 
and examine state transition rates in a Markov-state model, to give coarse-grained descriptors of the motions. Our MD 
simulations show that while there are many strongly correlated motions, antibodies are highly flexible, with F ab and F c 
domains constantly forming and breaking contacts, both polar and non-polar. We find that salt bridges break and reform, 
and not always with the same partners. While the MD simulations in explicit water give the right time scales for the 
motions, the simulated motions are about 3-fold faster than the experiments. Overall, the picture that emerges is that 
antibodies do not simply fluctuate around a single state of atomic contacts. Rather, in these large molecules, different 
atoms come in contact during different motions. 



Introduction 

We are interested in understanding the solution dynamics of 
antibodies. The solution dynamics of antibodies is critical to 
antibody function. The main body of information comes from 
a small number of full-length antibody structures, but static 
structures do not give insight into the dynamics. The internal 
dynamics of multidomain proteins has been a topic of interest 
in a number of different studies. 1 " 7 The focus of these studies has 
been on fluctuations in the solvent around the protein, the sec- 
ondary structure, the domain-domain interactions and the con- 
formations of the amino acid side chains. 

Here, we study the internal dynamics of the IgGI class of anti- 
bodies. The IgGI class of antibodies is the most abundant human 
IgG subclass and the template for the majority of antibody drugs. 
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IgGI is composed of four polypeptide chains, two heavy chains 
(HC) and two light chains (LC) . These four chains fold into three 
domains: two F ab domains that contain determinants for antigen 
binding and an F c domain responsible for the effector function 
and binding of the F c receptor proteins. 8 From the early studies 
of antibody structure, it was apparent that F ab domains were con- 
nected to the F c through the unstructured linker (hinge region), 
rendering them capable of binding to antigens separated by a 
range of distances. 910 In addition, electron microscopy reports on 
immobilized complexes provide evidence for the significant axial 
rotation of the F ab domain. 1112 In subsequent years, a wide range of 
biophysical techniques have been used to investigate the dynam- 
ics of these versatile molecules and to look into the role of the 
hinge region in the structure and function of the antibodies. 13 " 16 
Substantial efforts have been made by many investigators to obtain 
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Figure 1. Locations of the engineered cysteines are shown as blue 
spheres. Antibody domains are labeled (V L , light chain variable region; 
V H , heavy chain variable region; C L , light chain constant region; C H1 , C H2 
and C H3 , heavy chain constant regions). 



high-resolution X-ray crystal structures of the intact antibodies, 
but the lack of a stable defined solution structure of the antibodies 
has prevented their crystallization and only few structures of full- 
length antibodies are currently available. 1718 Low-resolution cryo- 
electron tomography studies have been very insightful in defining 
the conformational space explored by the antibodies. 19 ' 20 

Previous studies suggest that the dynamics of the antibodies 
could play a role in their function (antigen binding, complement 
activation) and solution stability; however, the molecular details 
of these observations remain unclear. 21 " 25 The purpose of the pres- 
ent work is to look at the inter- and intra-domain motions, their 
timescales of motion, and to understand which amino acid resi- 
dues contribute to the local and global flexibility. We do this by 
combining the insights obtained by time-resolved spectroscopy 
experiments and molecular dynamics simulations. Our MD sim- 
ulations extend an order of magnitude longer than prior simula- 
tions on full-length antibodies, and we use this data to construct 
a Markov model characterizing the major conformational species 
in antibody solutions. Our results indicate that F ab and F c frag- 
ments form multiple meta-stable protein-protein interactions, 
with heterogeneous protein-protein interaction surfaces based 
on multiple polar interactions. 

Results and Discussion 

Comparison of experimental and computational results. In this 
work, we used two independent approaches to look at the dynam- 
ics of a large multidomain biological molecule: time-resolved 
fluorescence anisotropy that allows determination of the rota- 
tional correlation times of the fluorescence probes site-specifically 
conjugated to the molecule of interest and all-atom molecular 
dynamics simulations that give us atomic detail structure of our 
system as it evolves over time in a given solution under defined 
conditions. To test the ability of the computational method in 
predicting the dynamics, we compared calculated rotational cor- 
relation times with the experimental values for three different 



systems of increasing size and complexity - free fluorescence 
probe, antibody domains and a full-length antibody. 

Dye dynamics. Free dye. For a dye model, we used a common 
fluorescence probe Alexa 488 maleimide, which has an excited 
state lifetime commensurate with its rotational correlation time. 
This probe possesses a long (five carbon) linker between the fluo- 
rophore and a maleimide group that confers a certain degree of 
the flexibility to the molecule alone and once it is covalently con- 
jugated to the protein of interest. Rotational correlation times 
for Alexa 488 maleimide have been reported previously and 
also estimated from MD simulations in two different solvents: 
water and methanol. 26 We performed computational analysis 
of this fluorescence probe to examine its dynamics in a similar 
way. Our calculated anisotropy decays were best fit to the sin- 
gle exponential function with decay times of 71 ps and 74 ps 
for water and methanol (data not shown) respectively and are 
in good agreement with the reported values from MD simula- 
tions (51 ps and 86 ps). 26 The observed difference between the 
calculated values here and those reported can be attributed to 
few differences in simulation parameters: (a) longer simulation 
times in this study (20 ns here compared with 1 ns in Schroder 
et al. 26 ); (b) different model for methanol used (OPLS forcefield 
parameters for methanol here compared with methanol param- 
eters from GROMACS forcefield); (c) all atom representation of 
Alexa 488 used here vs. united- atom GROMACS force field used 
previously. In both cases, calculated values are consistently lower 
than the measured parameters (170 ps and 210 ps) and the dif- 
ference is due to the lower than experimentally measured solvent 
viscosity used in simulations. 27 These initial results suggest that 
predicted rotational correlation times will also be shorter for the 
other two systems we examined (antibody fragments and a full- 
length antibody) . 

Dye covalently coupled to the antibody. Description of the dye 
behavior at the site of the conjugation to the protein is a compli- 
cated topic and a variety of mathematical models have been used 
to account for the movement of the probe in the measured fluo- 
rescence parameters. 28 ' 29 The reorientation time of the probe and 
the direction of the excitation and emission transition dipoles 
of the fluorophore are all parameters that can potentially have 
large impact on the measured values (whether these are steady- 
state or time-resolved values). In this work, we use three different 
full-length THIO-mAb derivatives of trastuzumab (humanized 
monoclonal antibody derivatives with engineered cysteines at 
defined locations; each THIO-mAb has two engineered cysteine 
substitutions) and their fragments (F ab and F ) for site-specific 
conjugation with N-(l-pyrene) maleimide. 30 ' 31 The locations of 
the engineered cysteines (two different THIO-MAbs with cys- 
teines in the F ab domain — HC121 and LC205, one THIO-mAb 
with cysteine in F c domain — HC403) are shown in Figure 1. 
To characterize the behavior of the fluorescence probe at each of 
these conjugation positions, we ran all-atom MD simulations of 
the fragments of these THIO-MAbs with probes at each of the 
conjugation positions: F ab -HC121, F ab -LC205 and F c -HC403. 
Starting structures for F ab trajectories were based on the crystal 
structure of the trastuzumab F ab (Protein Data Bank ID 1N8Z). 32 
The F c structure was generated by removing the F ab domains from 
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the full-length human IgGl crystal structure (PDB ID 1HZH) 
up to the upper hinge lysine residue (F structure generated this 
way contained the entire intact F c domain and all of middle and 
lower hinge) . 33 

Each conjugation position provides a unique environment for 
the pyrene probe in terms of the combination of the hydrophilic 
and hydrophobic side chains of the protein that surround the 
probe. Pyrene itself is highly hydrophobic and thus expected to 
interact with the hydrophobic residues in its vicinity. The crystal 
structure of the protein provides a static picture of the environ- 
ment of the probe at each conjugation position and examination 
of the number of hydrophobic residues around the attachment 
sites can serve as a qualitative measure of the extent of the dye 
mobility. In this work, we substituted three different residues 
with cysteines for site-specific conjugation: HC Alal21Cys, LC 
Val205Cys and HC Ser403Cys. Of these three, LC Val205 seems 
to be the most buried and has the highest number of hydrophobic 
residues within 5 A radius, suggesting that the probe at this site 
would be the most rigid. 

MD trajectories provide much more detailed picture of the 
dye environment and how it changes over the course of the simu- 
lation. Specifically, the parameters that can be estimated from the 
crystal structure (e.g., surface exposure, potential interaction sur- 
faces near the dye) change over time and may not be accurately 
represented in the static structure. To gain a better insight into 
dye behavior on the protein surface, we explicitly modeled pyrene 
at each conjugation site. The orientation of the dye was randomly 
selected at the beginning of each MD run. Two F ab trajectories 
were collected for each of the two labeling positions on the F ab 
(F ab -HC121 and F ab -LC205) and one trajectory for F c with two 
F c -HC403 probes. We evaluated the mobility of the probe based 
on two different parameters: trajectory- averaged solvent acces- 
sible surface area (SASA) of the probe and the autocorrelation 
function decay time of the probe vector in the frame of the pro- 
tein (see Material and Methods). Table 1 provides a summary 
of the trajectory- averaged SASA for all three positions. Pyrene 
at position HC121 seems to be the most exposed (largest SASA 
value) and, at position LC205, the most buried. Table 1 lists the 
trajectory- averaged SASA values for the conjugation residues esti- 
mated from the F ab and F c trajectories with no dyes. All SASA 
values for the residues themselves are significantly lower than the 
corresponding values for the pyrene probe conjugated to those 
sites, which is a reflection of the presence of the linker between 
the residue and a probe. In the case of the residues themselves, 
F c -HC403 is the most exposed and the other two residues display 
comparable surface exposure. Interestingly, two pyrenes on F c 
display different surface exposure even though they are chemi- 
cally identical, i.e., belong to chemically identical chains and are 
conjugated to the identical positions. Closer examination of the 
two chains in the starting structure of the F c trajectory shows that 
structurally they are not identical (RMSD = 1.5 A) and remain 
distinct over the course of the 100 ns MD trajectory. 

As expected, the mobility of the probe varies depending on 
the attachment position as judged by the autocorrelation func- 
tion decay time in protein frame (Table 2). Pyrene at HC121 
position appears to be the most mobile in our simulations. As in 



Table 1. Solvent accessible surface area (nm 2 ) of the pyrene probe or the 



side chain it is covalently attached to 





HC121 


LC205 


HC403 


Pyrene-1 


388 


285 


310 


Pyrene-2 


379 


291 


351 


Side Chains 


55 


57 


101 



Side chain values were derived from the MD trajectories of the uncon- 
jugated fragments. 



Table 2. Computationally derived pyrene dye rotational correlation 
times predicted from MD data 





HC121 


LC205 


HC403 


Pyrene-1 


12 ns 


423 ns 


1017 ns 


Pyrene-2 


7 ns 


127 ns 


68 ns 



HC121 and LC205 data are from two different trajectories. HC403 data 
are from a single trajectory with two probes (protein frame). 



Table 3. Comparison of experimental and molecular dynamics 



Fluorophore Conjugation Position: F ab and F c 


Method 


F -HC1 21 F h -LC205 F -HC403 

ab ab c 


Experimental 


20 ± 3 ns 24 ± 2 ns 11 ± 3 ns 


MD (lab frame) 


3.4 ± 0.4 ns 11.1 ± 2.3 ns 11.2 ± 3.8 ns 



Rotational correlation times for trastuzumab fragments (F ab and F c ) de- 
termined using time-resolved anisotropy decay or calculated from the 
MD trajectories (lab frame). 

the case of surface exposure, two pyrenes on F c domain (position 
HC403) remain distinct in terms of reorientation times, with one 
being faster than the other (68 ns vs. 1028 ns). In each MD run, 
we observe pyrene probe forming transient interactions with the 
hydrophobic patches in the vicinity of the attachment site. In the 
case of the two F c probes, different starting orientations were cho- 
sen for the MD run, as reflected in the surface exposure (Table 
1). As a result, HC403-1 forms a long-lived interaction with anti- 
body side chains resulting in much longer rotational correlation 
time in the protein frame. 

Antibody dynamics. F b and F dynamics. Modeling pyrene 
probes in MD trajectories allows direct comparison of the hydro- 
dynamic properties of the antibody fragments as measured 
experimentally with the predicted parameters. F ab simulation was 
performed in a 850 nm 3 box with approximately 25,500 water 
molecules and in the case of F c — 1,000 nm 3 box with approxi- 
mately 33,700 solvent molecules. Simulations were performed at 
pH 7.0 and 25°C. At this pH, the trastuzumab F ab is positively 
charged (+5). Chloride ions have been added to the simulation 
box to make the system neutral. The F c is neutral at this pH. 

Experimental and computational results for rotational correla- 
tion time of the antibody fragments are summarized in Table 3. 
In all cases, experimental anisotropy decay data was fit to the 
sum of two exponentials and an average anisotropy decay time 
was calculated (see Material and Methods). As can be seen from 
Table 3, the experimental results give approximately the same 
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Table 4. Rotational correlation times calculated for the bonds of the given resides from the MD data of the antibody fragments 







F ab -HC121 


F ab -LC205 






F c -HC403 


Trajectory 




C a -N 




C a -N 




C a -N 


with dyes 


13 ± 2 ns 


16 ± 2 ns 


16±2ns 


12 ±3 ns 


15 ± 2 ns 


16 ± 1 ns 


no dyes 


10 ±2 ns 


16 ±0.1 ns 


13±4ns 


11 ± 1 ns 


16 ± 7 ns 


15 ±1 ns 



Trajectories with or without fluorescence probe present were used. 



rotational correlation time for both conjugated positions on the 
F ab , whereas the F c gives a slightly faster rotational correlation 
time. As the F ab and F c volumes are comparable, we attribute the 
differences between F , and F rotational correlation rates to local 

ab c 

flexibility of the protein around the attachment position or an 
altered intrinsic flexibility of the fluorescence probe at each posi- 
tion (as defined by distinct surface exposure and dye-side chain 
contacts). In addition, the extent of fluorescence depolarization 
depends on the orientation of the emission transition moment of 
the fluorescence probe relative to the axis of rotation of the mac- 
romolecule to which it is conjugated. 34 Previous reports on time- 
resolved anisotropy of the macromolecules attempt to extract 
individual contributions of different sources of depolarization 
(e.g., dye, domain and whole protein tumbling); 35 however, 
because the independent determination of individual contribu- 
tions is not possible with the experimental tools that we have and 
the problem of defining these contributions from a single decay 
does not have a unique, well-defined solution, use of the single (or 
the average) decay time is an appropriate measure of dynamics for 
a given labeling position. 24 

As can be seen from Table 3, predicted rotational correla- 
tion times from the MD data are in overall agreement with the 
experimental measurements, but there are obvious differences. 
As mentioned before for the free dye, viscosity of the water model 
used here is lower than the experimental viscosity, leading the 
modeled dynamics to be faster than the observed values (with the 
exception of HC403). In addition, we hypothesize that the local 
protein environment of the probe (based in this case on the crys- 
tal structure) may not necessarily be the most representative state 
found in freely diffusing molecules. Rather, it is likely (for at least 
some of the dye conjugation positions) that local conformation 
determined by crystallography or a given local structure resides in 
a local energy minima, thus biasing the dynamics of the probe for 
that specific local environment. As a result, we observe that the 
apparent tumbling time of the F ab — as predicted from pyrene at 
position HC121 — is faster than the other two positions, a result 
of the greater flexibility of the dye itself at this position. To gain 
a better understanding of the dye dynamics, starting structures 
based on NMR data would probably offer an advantage; how- 
ever, no such structures are available for the antibodies due to 
their large size. 

Rotational correlation time can also be predicted from the 
autocorrelation function decay of the bonds of the dye attachment 
residue; specifically, C a -Cp and C a -N bonds of these residues 
(Table 4). These values were determined from the trajectories of 
the fragments without probes and trajectories with probes pres- 
ent. In both cases, rotational correlation times are similar (the 
presence of the dye does not appear to affect the local dynamics 



of the bonds) and compare well with the experimentally mea- 
sured parameters (compare with the experimental values in Table 
3). In this case, the determined rotational correlation time does 
not explicitly incorporate dye dynamics at each specific position, 
but it does offer a good approximation of the local dynamics as 
judged by the similarity of the calculated and experimental val- 
ues. From the previous points, it follows that measuring dynam- 
ics from explicitly modeled probes and from the bonds of the 
residues gives similar results and either can be used to predict the 
dynamics. 

Dynamics of full-length antibody; all atom molecular dynam- 
ics simulations. To test the ability of molecular dynamics one 
step further, we compared the parameters obtained from 1 |xsec 
(total) of all-atom MD simulations of full-length antibody with 
experimentally measured parameters. 

Full-length trastuzumab is computationally challenging to 
model with molecular dynamics simulations because of its large 
size (1328 residues, in contrast with commonly modeled proteins 
that have ^50 residues or less) and multiple domains (almost all 
modeled proteins are single domain, whereas trastuzumab has 
three distinct domains connected via flexible linker). A critical 
challenge is to obtain enough sampling of the conformational 
dynamics. We address this issue by comparing our computa- 
tional results with fluorescence anisotropy. 

We maximize conformational sampling in our simulations by 
the following two approaches. First, we run our simulations on a 
multi-processor Linux cluster using optimized molecular dynam- 
ics software (multiprocessor GROMACS 4 36 with virtual hydro- 
gens, allowing for 4 fs timesteps (see Materials and Methods for 
more details) . We collected approximately 2 |xs of data, an order 
of magnitude more than has been previously published for full- 
length antibodies. 37 Second, we use multiple starting structures 
obtained by (1) homology models from two different templates 
(see below), and (2) coarse-grained modeling with an anisotro- 
pic network model 38 to calculate the extremes of the lowest fre- 
quency normal modes (see Materials and Methods for details). 
Simulations where performed in explicit water in a 43,000 nm 3 
box. The box contained 6,870 protein atoms and approximately 
135,000 molecules of solvent. 

Initial full-length trastuzumab homology models were gen- 
erated based on two crystal structures available in PDB public 
database, human IgGl (PDB ID 1HZH) and mouse IgGl (PDB 
ID 1IGY) (Fig. 2). The templates are structurally distinct from 
one another in the following two ways: (1) the F ab fragments are 
rotated ^180° along its long axis in one structure vs. the other, 
and (2) in one template there is significant inter-domain interac- 
tion between one F , and the F , whereas in the other there is less 

ab c 

interaction between the fragments and the F c is perpendicular to 
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the plane of the 2 F ab s. The first model was 
based on template 1HZH, 17 a IgGlK human 
antibody from an HIV immune patient. 
The second model was based on template 
1IGY, 18 a mouse anti phenobarbital antibody. 
1N8Z trastuzumab bound to the extracellu- 
lar domain of the HER2 receptor was used 
as the F ab template for both homology mod- 
els. 32 The G2 form of sugar was used, with 
coordinates taken from 1L6X, 39 the crystal 
structure of rituximab F . The two struc- 

c 

tures differ significantly so we chose to cre- 
ate both structural models because there is 
no compelling evidence for the prevalence of 
either one in solution. Anisotropic network 
model (ANM, see Materials and Methods) 
was used to generate starting structures for a 
total of 12 MD trajectories (six for 1HZH- 
based models and six for HGY-based models) 
70 nsec to 500 nsec in length. All starting 
structures are pictured in Figure SI. Table SI 
summarizes all the simulations we ran. In 
the case of full-length antibody simulations, 
we did not model fluorescence probes. The 
parameters describing protein dynamics were 
calculated from the bonds of the correspond- 
ing residues. 

Measured anisotropy of full-length trastu- 
zumab. Results for the measured rotational 
correlation times are summarized in Table 5. 
For the full-length mAb, each conjugation 
position gives a different value of the rota- 
tional correlation time. We attribute these differences primarily 
to the unique environment of the probe. The F ab -LC205 shows a 
slower rotational correlation time as compared with F ab -HC121, 
reflecting a more constrained environment in the full-length 
mAb. This difference was not apparent in the measured anisot- 
ropy of the fragments, demonstrating that the local environment 
between the two positions differs between the fragment and a 
full-length antibody. 

To better understand these differences, we also performed 
anisotropy measurements under three different salt concentra- 
tions to assess the impact of ions on the dynamics of the anti- 
body. As can be seen from Table 5, salt concentration does not 
have an effect on the apparent rotational correlation times, and 
thus the apparent hydrodynamic radius of the molecule remains 
unchanged. In addition, as a measure of sensitivity of time- 
resolved anisotropy to the changes in the dynamics of the mac- 
romolecule, we determined the rotational correlation times for 
the trastuzumab THIO-mAb that lacks hinge disulfide bonds 
(to achieve this, hinge cysteine residues have been substituted to 
serines; we refer to this construct as "hingeless"). The fluores- 
cence probe in this case was located on the F ab fragment (LC205). 
Absence of the covalent link between the two heavy chains should 
allow more flexibility to the F ab fragments and effectively decou- 
ple the F ab motion from the rest of the antibody. As expected, 






Figure 2. Crystal structures of human (A) and mouse (B) IgGI used for generating homology 
models of trastuzumab. In both structures, F ab fragments are structurally distinct and have 
been treated as distinct domains throughout the manuscript. 



Table 5. Experimental (first three rows) and calculated (MD) rotational 
correlation times for the full-length trastuzumab as determined from 
three different conjugation positions 

Fluorophore conjugation position: 
Full length mAb 

F -HC121 



Solution conditions 

pH 7.0, no salt 
pH7.0, 150mMNaCI 
pH 7.0, 450 mM NaCI 

MD (lab frame) 



46 ± 4 ns 
44 ± 5 ns 
49 ± 4 ns 
41 ± 24 ns 



F h -LC205 

ab 

73 ± 7 ns 
73 ± 5 ns 
75 ± 5 ns 
43 ± 21 ns 



F -HC403 

c 

35 ± 6 ns 
33 ± 4 ns 
37 ± 5 ns 
47 ± 16 ns 



Table 6. Rotational correlation times for trastuzumab LC205 full lenth 
antibody derived from E. coli or CHO cells (intact antibodies) and trastu- 
zumab derivative with hinge disulfides removed ("hingeless"), in native 
and reduced states 



Full length LC205 


Solution conditions 


E coli 


Eco//"hingeless" CHO 


pH 7.0, 150 mM NaCI 


73 ± 5 ns 


54 ± 6 ns 74 ± 9 ns 


DTT, 12 h 


53 ± 7 ns 


49 ± 6 ns 50 ± 5 ns 
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for this molecule, we see the decrease in the correlation time 
(see Table 6). The value of the rotational correlation time for 
the F ab -LC025 within the "hingeless" constructs is intermediate 
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1 2 3 4 5 6 7 8 9 10 11 12 
FAB1 FAB 2 FC 

herceptin backbone CA atoms 



Figure 3. Correlated motion between all residue pairs in trastuzumab is 
represented as a 'heat map'. In this 'heat map', the x- and y-axes identify 
the residue numbers and the color indicates the amount of correla- 



between the free F , and a F , in full length trastuzumab and thus 

ab ab t> 

it appears that F ab motion does not become completely decoupled 
from the rest of the antibody. The same table shows the values 
of rotational correlation times for antibodies derived from two 
different sources (E. coli and CHO cells). The only difference 
between the two is their glycosylation state (E. coli-derived anti- 
bodies lack sugars). The rotational correlation times for these 
molecules are the same, suggesting that the absence of the glyco- 
sylation does not substantially alter the apparent hydrodynamic 
radius of the molecule. Reduction of the intact antibody (under 
these conditions only interchain disulfides are reduced) results in 
the value of the rotational correlation time comparable to the one 
for the "hingeless" derivative (second row, Table 6). 

Comparison of measured anisotropy vs. calculated. The last row 
of the Table 5 shows the calculated rotational correlation times 
based on the MD data for the full-length antibody. These values 
represent an average of all 12 trajectories based on the two initial 
models since the decay times calculated from trajectories based 
on these models were comparable (data not shown). It appears 
that the dynamics as observed using MD simulations is not 
affected by the choice of the initial structural model. 

As mentioned previously, the calculated rotational correla- 
tion times were estimated from the decays of the autocorrelation 
functions of the C a -C p and C a -N bonds of the corresponding 



residues (and not explicitly modeled probes). As a result, the 
predicted values lack the contribution from dye dynamics. 
Despite this, calculated values are in excellent agreement with 
the measured correlation times, again justifying the use of the 
computational tools in describing solution dynamics of large 
multi-domain proteins. 

Computational dynamics of trastuzumab: Correlated 
motions. Having validated our computational models with fluo- 
rescence anisotropy, we next describe the conformational dynam- 
ics of trastuzumab since correlated motion in proteins is essential 
to function. 40 To quantify the correlations, we use mutual infor- 
mation (MI), a metric from information theory that captures all 
correlations including nonlinear and higher order correlations. 41 
We use MI as a metric because MI in molecular dynamics has 
been shown to capture biologically relevant dynamics, 40 ' 42 " 44 and 
a comparative analysis revealed that correlations captured by MI 
were more functionally relevant than methods such as principal 
component analysis. 40 

MI between all residue pairs was calculated and normalized 
as described in Materials and Methods. Briefly, the trajectory 
of each residue was represented by the distance of the backbone 
C a atom from its moving average position (the average position 
in a 500 ps window centered on the current time step). MI is 
normalized by the joint entropy H(X,Y) so that MI between 
lower entropy residues is weighted equally with MI between 
higher entropy residues. In this report, normalized MI, 
MI norm , are provided. Figure 3 shows the MI norm between all 
1328 x 1328 residue pairs in the following order: F ab fragment 
1 (light chain 1, and the variable and C H1 domain from heavy 
chain 1), F ab fragment 2 (light chain 2, and the variable and 
C H1 domain from heavy chain 2), and the F c fragment (C H2 and 
C H3 from heavy chain 1, and C H2 and C H3 from heavy chain 
2). The rows and columns of the MI matrix in Figure 3 are the 
residues, and the color-coded values in the matrix correspond 
to the amount of correlation seen in the dynamics as quantified 
by MI norm . Strong correlated motion between two residues (high 
MI norm ) is red, and two residues that move independently (low 
MI ) are blue. The definition of MI results in the following 

norm 7 e> 

two properties of the matrix: (a) the highest MI is between a 
residue and itself [MI(X,X)], the matrix diagonal (perfect cor- 
relation), and (b) the matrix is symmetric because MI(X,Y) = 
MI(Y,X). In all figures, we omit the values on the diagonal to 
redistribute the color scale for clarity (instead of the scale going 
from 0 to 1, the scale goes from 0 to the maximum non-diag- 
onal MI norm value). Strong correlated motion in residues that 
are adjacent in primary sequence is seen as the thickness of the 
diagonal. The secondary structure of trastuzumab is largely 
antiparallel (3 strands arranged in (3 sheets, visible as high MI 
lines perpendicular to the diagonal because residue X in one (3 
strand has high MI to residue Y in an adjacent (3 strand, residue 
X + 1 has high MI to residue Y - 1, etc. As expected, there is a 
strong correlation in motion between residues that are adjacent 
or interacting. 

Correlated motion reveals tight coupling within the Ig 
domains. MI reveals a high degree of coupling within each Ig 
domain (Fig. 4; each of the 12 Ig domains are visible as bright 



tion motion in the molecular dynamics simulations (red, high; blue, 
low). Correlation is calculated by Ml norm (a general form of correlation 
that takes both linear and nonlinear correlation into account). Ml norm 
ranges from 1 (maximum correlation; e.g., Ml between a residue and 

-* x > ~> > norm 

itself) and 0 (two residues that move independently). To expand the 
color scale to clarify the figures, the diagonal line representing Ml norm 
between a residue in itself is not included in the color scale. The F . and 

ab 

F c fragments are easily identifiable as four Ig domains in which intra-lg 
domain residues have highly correlated dynamics. 



www.landesbioscience.com 



mAbs 



311 



squares in the matrix). Residues 
within each of the 12 Ig domains are 
most coupled to other residues within 
the same Ig domain and there is tight 
coupling throughout most residues 
in the domain. Secondary structure 
interactions between the (3 strands 
of the (3 sheets are the most tightly 
coupled (e.g., interactions between 
anti-parallel strands are visible as 
lines perpendicular to the diagonal). 
This suggests that residues within an 
Ig domain move as a unit. 

Whereas there is no direct 
experimental evidence available for 
coupled motion in Ig domains, the 
following properties of antibodies 
are consistent with tightly coupled 
dynamics between residues in Ig 
domains. Antibodies are highly ther- 
mostable. 45 Ig domains are modular, 
which can be seen in the following 
two properties. First, many immune 
proteins are composed of multiple Ig 
domains in various configurations. 
Second, proteins composed of sub- 
sets of the Ig domains of full-length 
antibodies (e.g., F ab alone, or just 
the variable Ig domains that contain 
the antigen binding sites) maintain 
similar antigen binding function. 
Finally, Ig domains are structur- 
ally conserved across proteins, 45 ' 46 
as well as between various states 
(e.g., the RMSD between trastu- 
zumab bound to antigen 32 or alone 47 is 2.5 A). 

Hinge residues move independently and have the greatest 
phi-psi local conformational entropy. The hinge region acts as 
a flexible linker between the fragments. Residues in the hinge 
have been grouped into three regions according to their relative 
orientation to the inter-chain disulfide bonds: the upper hinge 
consists of residues between the disulfide bonds and the F ab frag- 
ment, the middle hinge consists of residues near the disulfide 
bonds, and the lower hinge consists of residues between the 
disulfide bonds and the F c fragment. 

MI of residues in the hinge region reveals that these residues 
move independently from the rest of the antibody (Fig. 5A), 
consistent with their characterization as a flexible linker between 
fragments. To further characterize flexibility, the phi-psi local 
conformational entropy of every residue was calculated (see 
Materials and Methods for more details). Briefly, the Shannon 
entropy in phi/psi angles was calculated for all residues of the 
full-length antibody. Residues in the hinge region were the most 
flexible (exhibited the highest phi/psi entropy), along with resi- 
dues at the N- and C-termini. Figure 5B shows the conforma- 
tional entropy of all residues in the hinge region. Residues with 
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Figure 4. Average correlated motion between the 12 Ig domains. (A) Residues within an Ig domain 
move in a correlated fashion (Fig. 1), and we average all intra-lg domain residues to highlight correlated 
motions between Ig domains. Each square on the diagonal is the average correlated motion between 
residue pairs in the same Ig domain. Off-diagonal squares indicate correlated motion between Ig do- 
mains. High correlated motion (high Ml ) is red; low correlated motion (low Ml ) is blue. (B) Cartoon 

-? v -? norm' ' v norm' v ' 

representation of the correlated motion between Ig domains. The highest correlation is seen between 
intra-F ab variable Ig domains and between CH3 Ig domains. Intra-F ab CHI Ig domains have correlated 
motion that is slightly lower than the variable and CH3 Ig domains. CH2 domains, which are glycosyl- 
ated, move relatively independently (low correlated motion). With the exception of CH2, Ig domains in 
the "horizonal" direction (protein-protein interaction along the direction of the (3 strands) move in a 
correlated fashion and Ig domains in the "vertical" direction (domain-domain interactions between the 
loops at the ends of p strands) do not move in a correlated fashion. Correlated motion between the two 
intra-F ab Ig domains could facilitate binding to antigen, and the lack of correlated motion between the 
variable Ig domains and CHI insulates antigen binding from effector functions. 



high conformational entropy are colored red, and residues with 
lower conformational entropy are colored yellow. Entropy clearly 
reveals the three regions of hinge residues — high entropy upper 
hinge, low entropy middle followed by high entropy lower hinge. 
The borders of these three regions are mostly created by the pro- 
line residues (this type of "mosaic" structure has been observed 
in NMR experiments 48 ). 

No common dynamics in complementarity determining 
regions. Each F ab fragment binds to an antigen via six loops called 
complementarity-determining regions (CDRs). MI of heavy 
chain CDRs and conformational entropy of the heavy and light 
chain CDRs are shown in Figure 6. Each CDR loop exhibits dif- 
ferent dynamics. Heavy chain CDR3 residues have the lowest 
MI between them and the highest conformational entropy. This 
indicates that the residues in this loop move relatively indepen- 
dently from one another and with high flexibility. The residues in 
this loop also move independently from the rest of the antibody 
(low MI between residues in the loop and residues in the rest of 
the antibody). The most tightly coupled loops (high MI between 
residues in the loop, indicating that the residues in the loop move 
together) are light chain CDR1, light chain CDR2, and heavy 
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Figure 5. The hinge region in trastuzumab is the most flexible region in the molecular dynamics simulations. Differences in dynamics delineate an 
upper, middle, and lower hinge region. (A) Correlated motion between all residue pairs in the trastuzumab hinge region represented as a 'heat map'. 
Residues with high correlated motion are colored red (high Ml norm ); residues that move independently are colored blue (low Ml norm ). Motion in hinge 
residues is less correlated with adjacent residues than Ig residues ('thinner' diagonal), and hinge residues move relatively independently from the Ig 
domains (low Ml norm between hinge and Ig residues). The middle hinge region, in which residues in one heavy chain are disulfide bonded to residues 
in the other heavy chain, has higher Ml norm between adjacent residues relative to the upper and lower hinge (middle hinge residues have a 'thicker/ 
slightly orange' diagonal relative to upper and lower hinge residues). (B) Residues in the hinge region have the highest conformational entropy. 
Conformational entropy is indicated by color on residues in the hinge. Red indicates high conformational entropy (high entropy in phi/psi angles), and 
yellow indicates medium conformational entropy. Residues in the upper and lower hinge region have the highest conformational entropy. 



chain CDR1, consistent with the canonical structure. 49 ' 50 Some 
loops have motion that is coupled to non-CDR residues in the F ab 
domain (high MI between one or two residues in the loop and the 
other F ab residues). The highest MI interactions between heavy 
chain CDR loops and non-CDR regions are shown in Figure 
6. We calculated the phi-psi conformational entropy of the F ab 
domain to further characterize flexibility, as we did for the hinge 
region (Fig. 6B). The CDR loops have the greatest conforma- 
tional entropy within the F ab domain, consistent with their previ- 
ous characterization as structurally flexible. Heavy chain CDR-3 
has the greatest conformational entropy of the six CDR loops. 

Correlated motion reveals interactions between Ig domains. 
Having established that residues within an Ig domain move in a 
coupled fashion, we reduce the 1328 x 1328 residue MI matrix 

r ' norm 

to a 12 x 12 matrix where each element in the matrix represents 
the average MI between all residues in the corresponding; Ig 

O norm r O O 

domain (s) (Fig. 4A). The MI between Ig domains are more easily 
viewed in this reduced matrix. 

Within F ab s/F c in general, there is coupled motion between 
Ig domains that interface "horizontally" along their (3 strands 
and less coupled motion between Ig domains that interface "ver- 
tically" between their loop regions (Fig. 4B). For example, the 
two variable domains within the F ab fragments move together 
(interchain), and move more independently from their adja- 
cent C H1 domains (intrachain). The exceptions to this are the 
C H2 domains, which do not move in a coupled fashion. These 
domains are glycosylated, and the sugars are the interaction site 
between the domains, resulting in reduced coupling between the 
domains. Figure 4B summarizes the coupling between the Ig 
domains. 



As in the above discussion, there is no direct experimental evi- 
dence for coupling between Ig domains; however, existing data 
are consistent with the coupling patterns in the computational 
data. In the case of more coupling between the two variable 
domains than between the variable domains and their adjacent 
C H1 domains, this data can be viewed in the context of the anti- 
gen binding function of the variable domains. Antigen binding 
occurs across three loops (CDRs) per variable domain for a total of 
six CDRs per F ab fragment (not all loops may be involved in bind- 
ing). Thus, binding is coordinated across two domains, which 
would be facilitated by coupled dynamics between them. In addi- 
tion, the variable domains move relatively independently from 
the rest of the antibody, which may decouple the antigen binding 
function from the effector function of the F c fragment. The lack 
of the coupled motion between the glycosylated C H2 domains is 
consistent with the fact that their interaction is primarily medi- 
ated through the two sugars attached to Asn297 and not between 
the residues of the domains. Carbohydrate-carbohydrate associa- 
tion is known to be weak relative to domain-domain interac- 
tions, and C H2 domains have higher crystallographic b-factor 
values than the adjacent C H3 domains, suggesting greater flexibil- 
ity. 33 The C H2 domain is known to melt at a lower temperature, 
suggesting the lack of interdomain residue interactions that may 
result in reduced stability of the C H2 domains relative to other 
domains. 50 

Between F ab s/F c , there is coupled motion between the "inner" 
Ig domains (the C m and C H2 domains), whereas the "outer" Ig 
domains (variable and C H3 domains) move relatively indepen- 
dently from the rest of the antibody (Fig. 4B). We investigate the 
molecular details of these interactions in the next section. 
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Figure 6. Heavy chain CDR-3 moves independently and has the highest conformational entropy, consistent with previous data showing that this CDR 
shows the greatest variability. (A) Correlated motion between all residues in the trastuzumab heavy chain variable Ig domain. Residues with high 
correlated motion are colored red (high Ml norm ); residues that move independently are colored blue (low Ml norm ). The three heavy chain CDR regions 
are circled. Heavy chain CDR-3 moves the most independently of all CDR regions (low correlated motion, light chain variable Ig domain not shown). 
(B) Conformational entropy (phi/psi entropy) in the variable Ig domains indicates that the greatest entropy is seen in the CDR regions. Heavy chain 
CDR-3 has the greatest conformational entropy of all CDR. Red indicates high conformational entropy (high phi/psi entropy), and blue indicates low 
conformational entropy (low phi/psi entropy) regions. 



Interactions between fragments are largely polar and non- 
specific. To investigate the nature of the coupled motion between 
fragments, we examined each trajectory in more detail. In every 
trajectory there is at least one occurrence of inter-fragment Ig 
domains in contact (contact defined as within a 6 A distance). 
Both F ab -F ab and F ab -F c contacts are made exclusively through 
the "inner" Ig domains (C H1 and C H2 ). Examination of the inter- 
action surfaces in detail revealed that the interaction surfaces 
are heterogeneous (and thus interactions are non-specific), with 
largely polar interactions mediated through multiple salt bridges. 
For example, the 500 ns trajectory started from the 1HZH homol- 
ogy model (trajectory 0), and a 90 ns trajectory started from 
the same homology model (trajectory 2); both have interactions 
between the two F ab fragments (between the C m domains). The 
interaction interfaces in these two trajectories are different. In 
trajectory 0, the interaction occurs via the following salt bridges: 
ASP336 and LYS188, GLU187 and ARG425, GLU187 and 
LYS404, GLU213 and LYS404, GLU401 and LYS183, GLU427 
and ARG211, and GLU427 and LYS190 (Fig. 7). In trajectory 2, 
the interaction occurs via the following salt bridges: ASP 151 
and ARG425, ASP336 and LYS149, and GLU195 and LYS340 
(Fig. 7). These interactions are largely through loop regions, but 
salt bridges to secondary structures also occur. These heteroge- 
neous interactions sometimes persist over the course of a given 
trajectory and sometimes are transient. 

All simulations discussed to this point were performed with 
no salt present and resulted in polar contacts forming between 
different surfaces of the antibody. To test the validity of these 
observations for physiological salt concentrations, we also per- 
formed MD runs in the presence of 150 mM NaCl using starting 




Figure 7. Salt bridges between fragments form and break during the 
simulations, and the domain-domain interfaces are heterogeneous 
(different trajectories form different salt bridges, even when the same 
fragments are interacting). A cartoon representation close up of the 
two F ab fragments indicates residues that participate in inter-F ab salt 
bridges for trajectory 0 and trajectory 2 (shown as a representative 
example). Residues that participate in salt bridges in trajectory 0 only 
are shown as cyan spheres. Residues that participate in salt bridges in 
trajectory 2 only are shown as purple spheres. Residues that participate 
in salt bridges in both trajectories are shown as blue spheres. 



models based on both 1HZH and 1IGY structures. We collected 
a total of six trajectories, each 160 ns long. Again, as in the case 
of no salt simulations, we observe that configurations, having at 
least one salt bridge, between any two of the domains, exist. We 
also observe contacts in which no salt bridges between any of the 
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Table 7. Average number of salt bridges between trastuzumab fragments in the context of the full-length antibody 



Starting structure 


Interacting fragments 


Number of salt bridges at 3.2 A 


Number of salt bridges at 5A 


No salt 


Salt 


No salt 


Salt 




F 1-F 2 

ab ab 


1.417 


3.051 


2.934 


6.89 


1HZH 


FJ-F 

ab c 


0.286 


0 


0.646 


0 




ab c 


0.675 


0.936 


1.159 


1.244 




ab ab 


0.549 


0.562 


1.135 


1.093 


1IGY 


FJ-F 

ab c 


0.518 


0.61 


1.641 


1.633 




F h 2-F 

ab c 


0.45 


0.476 


0.83 


0.802 



Trajectories with initial structural models based on both crystal structures (human and mouse IgGI) were used. 



domains exist. In an attempt to understand inter-domain (F b — 
F ab /F ab -F c ) salt bridges of trastuzumab in solution, we compared 
the no salt simulations to the salt simulations by calculating the 
average number of salt bridges between each of the domains and 
determining if salt has an effect on the average number. 

Table 7 lists the average numbers of inter-domain salt 
bridges. The numbers of salt bridges are larger for "salt" simula- 
tion averages than for the "no salt" simulation ones. The differ- 
ences are not significant except for the F ab l-F ab 2 inter-domain 
salt bridges from the simulations with starting models built from 
1HZH template. In theory, adding salt will weaken inter-residue 
electrostatic interactions. In this case however, it appears that 
salt screens the electrostatic repulsion between the F ab fragments 
(each F ab has a net +5 charge at pH 7.0) and does not signif- 
icantly affect the F ab -F c interaction (F is not charged at this 
pH) . We thus conclude that domain-domain interactions (with 
or without polar contacts) exist under both conditions tested. 
Figure S2 provides a summary of the total number of contacts 
formed over the course of salt and no salt trajectories. 

Experimental evidence for inter-fragment interaction comes 
from structural studies of antibodies in crystallized (X-ray crys- 
tallography) or immobilized (electron microscopy) form. X-ray 
crystallography of a full-length human IgG antibody (used as a 
template to build a homology model for this study) reveals sig- 
nificant interaction between one of the F ab fragments and the F c 
fragment. 17 X-ray crystallography of multiple proteins with IgG 
domains reveals a diversity of interaction surfaces between two Ig 
domains or between Ig domains and other proteins, and indeed 
each part of the surface has been part of a domain-domain 
interaction interface. 51 Electron microscopy also reveals confor- 
mations that are consistent with inter-fragment interactions, 1112 
albeit in less detail than X-ray crystallography. Antibodies are 
soluble proteins, and, as such, their surface is polar, which is 
consistent with polar interactions between fragments. Finally, 
there is a high effective local concentration of the F ab /F c frag- 
ments in the vicinity of each other. This high concentration, 
estimated to be millimolar, likely drives these non-specific inter- 
actions. Finally, the flexible hinge region connecting the frag- 
ments allows for multiple relative orientations of the fragments 
exposing different surfaces for potential non-specific inter-frag- 
ment interaction. 

The solution state of trastuzumab is a heterogeneous pop- 
ulation of conformations with inter-fragment interactions. 



To estimate contact rates between trastuzumab domains from 
the simulation data, we use a Markov State Model (MSM) 
approach. An MSM is a kinetic model that describes conforma- 
tional dynamics as transitions between kinetically metastable 
states. If such a set of suitably Markovian states can be identified, 
estimates of transition rates between states can be used to obtain 
a complete description of the thermodynamics and kinetics of the 
system. 52 " 55 MSM approaches have been very useful, for example, 
in integrating large-scale simulation data to build networks of 
protein folding at long time scales. 52 ' 53 

Here, we used the MSM framework to address the more 
modest task of estimating the timescales of motion involved in 
interdomain contact formation. Our goal is to obtain simple esti- 
mates of the conformational dynamics of trastuzumab in solu- 
tion, rather than detailed conclusions about specific mechanisms 
of domain interaction or specific kinetic rates. To do this, we first 
manually define a set of eight conformational states, each having 
a unique combination of interdomain contacts. We then estimate 
rates of making/breaking interdomain contacts from the simula- 
tion data (taking special care in estimating statistical uncertain- 
ties from finite sampling effects). The resulting 8x8 rate matrix 
is diagonalized to obtain a spectrum of implied timescales of 
motion, as described in more detail below. 

We manually chose a state decomposition of 2 3 = 8 states, one 
for each unique configuration of possible inter-domain contacts 
(F ab l-F ab 2, F ab l-F c , and F ab 2-F c ; Fig. 8A). Trajectory snapshots 
were assigned to states based on the number of atomic contacts 
between domains (two atoms are considered in contact if they 
are within 6 A of each other). A threshold between 500 and 800 
atomic contacts is used to determine state transitions: if two 
states are in contact, they must have less than 500 atomic con- 
tacts before they are considered to lose contact, and if two states 
are not in contact, they must have more than 800 atomic con- 
tacts before they are considered in contact. This procedure serves 
to prevent the overestimation of transition rates due to multiple 
barrier crossings. Although the states were chosen manually, 
the decomposition was chosen with the goal of having minimal 
assumptions about the molecular mechanisms underlying con- 
tact formation. While inter-domain contacts may form at a num- 
ber of different interaction sites, it is reasonable in this case to 
consider a simple coarse-grained model that assumes an average 
contact rate for each pair of domains, integrated over all possible 
interactions. 



www.landesbioscience.com 



mAbs 



315 



In keeping with this, we analyzed 12 independent 
simulations started from different initial conformations. 
The starting configurations came from homology mod- 
els built from immunoglobulin crystal coordinates (PDB 
IDs: 1IGY, 1HZH) and coarse-grained ANM models. 
Because of the large size of trastuzumab, the simulations 
do not ergodically sample all possible contact states. In 
many cases, different simulation trajectories sample dis- 
joint regions of state space, due to their very different 
starting conformations. Thus, there is a great amount of 
sampling bias that enters into the MSMs we construct. A 
nonparametric bootstrap analysis is used to estimate the 
variance in contact rates introduced by this finite sam- 
pling bias (see Materials and Methods). 

Figure 8B shows the transition matrix T (for a lag time 
of t = 1 ns) compiled from observed transition counts 
across all 12 simulation trajectories. The transition rates 
show that most of interconversion between states is pre- 
dicted to arise from the making and breaking of F ab l-F c 
and F ab 2-F c contacts. The equilibrium state probabilities 
(Fig. 8C) predicted by the MSM show that states with 
one or two contacts are the most likely. 

The eigenvector structure of the slowest-timescale 
mode shows that the slowest contact events correspond 
to the making and breaking of F ab l-F ab 2 contacts 
(Fig. 8C). The posterior distribution of the slowest 
implied timescales calculated by our bootstrap procedure 
predicts a mean implied timescale of 35.5 ns (Table 8), 
although the range distribution is quite broad, ranging 
from 7.1 ns to ^2.0 jxs (the smallest and largest values 
sampled in the bootstrap, respectively) . 

Based on these results, we conclude that in solu- 
tion, trastuzumab has a heterogeneous population of 
conformations, most with one or more inter-fragment 
interactions. Although accurate estimates of dynamical 
timescales are limited by finite sampling, our best esti- 
mate from the simulation data are that these events occur 
on the ^40 ns timescale, although motions could be as 
slow as microseconds. Hence, we expect trastuzumab in solution 
to be interconverting between these metastable states. 

Materials and Methods 

Molecular dynamics simulations. Molecular dynamics (MD) 
simulations of the complete trastuzumab antibody and its F ab 
fragment were performed with the GROMACS 4.0 simulation 
software package. 36 The simulations were performed on a 256 
processor computer cluster running the Linux operating system 
with Infiniband® network connectivity. 

Homology models were built and used for the initial structures 
of the complete trastuzumab antibody. The templates for the 
homology model were the human IgGl (1HZH) 17 and murine 
IgGl (1IGY) protein crystal structures. 18 Homology models were 
built by structurally aligning the F ab crystal structure (1NZ8) 32 
to the respective template F ab s and performing a direct coordinate 
replacement. Missing atoms, in the template structure, were built 
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Figure 8. (A) A schematic diagram of the three domains of trastuzumab. (B) A 
visual depiction of the transition matrix (T) (for a lag time of t = 1 ns) compiled 
from observed transition counts across all 12 simulation trajectories. (C) Transi- 
tion probabilities from T, shown as a graph. Transition probabilities greater than 
0.02 are shown as a solid arrow. 



Table 8. Mean implied timescales for each relaxation mode in the MSM 
(for a transition matrix T compiled from observed transition counts 
across all 12 simulation trajectories) 



Mode 


Mean implied timescale (ns) 


Standard deviation (ns) 


1 


35.548 


32.240 


2 


11.252 


3.612 


3 


7.897 


2.490 


4 


2.818 


1.384 


5 


0.489 


0.689 


6 


0.154 


0.036 


7 


0.141 


0.023 



in using the SWISS -MODEL 56 " 59 homology modeling server. In 
particular, residues in the upper hinge of one heavy chain were 
missing in the template pdb 1HZH. In addition to missing resi- 
dues, the hinge region of template 1HZH was missing one of two 
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interchain disulfide bonds. The hinge coordinates were relaxed to 
within disulfide bond distance using energy minimization. This 
reconstructed hinge was used to replace the hinge region for tem- 
plate 1IGY, a murine antibody with a different hinge. 

The structure of the two trastuzumab homology models (see 
e.g., Fig. 2) were further analyzed using an anisotropic network 
model (ANM) 56 ' 57 to calculate the normal modes of the trastu- 
zumab antibody (http://ignmtest.ccbb.pitt.edu/cgi-bin/anm/ 
anml.cgi). A 15 -A cutoff was used to define interacting residues. 
Eight structures, four for each of the homology models, along the 
two lowest amplitude modes were selected as additional starting 
structures for the MD, for a total of ten initial starting structures. 
Structures that represented the extremes of the range of motion 
were chosen as starting structures to maximize exploration of 
conformational space. 

All trastuzumab structures were glycosylated in G2 form with 
sugars on HC ASN300 in the Fc domain. 

The 1N8Z trastuzumab F ab crystal structure was used as a 
starting structure for F ab only simulation. The F ab was conjugated 
with N-(l-pyrene) maleimide, which forms a succinimide upon 
conjugation, at LC resides 205 and HC residue 121 in some of 
the F ab only simulations. The initial starting configuration was by 
hand followed by equilibration and randomization. 

The OPLSAA force field 60 ' 61 was used to model the protein. 
The charge state of the titrateable residues was evaluated using 
the empirical method PROPKA. 62 ' 63 All the residues were set 
to their canonical protonation state. The OPLS carbohydrate 
force field 64 ' 65 was used to model the bonded and Lennard- 
Jones parameters of the carbohydrate used to glycosylate the 
full-length antibody. The charges on the carbohydrate were 
obtained using the semi-empirical charge model AM1-BCC 66 
as implemented in the Antechamber software package. 67 ' 68 The 
boned and Lennard-Jones parameters of the N-(l-pyrene) 
maleimide were obtained using the OPLS forcefield where avail- 
able. The charges on the N-(l-pyrene) maleimide were obtained 
AM1-BCC. 

OPLS atomstypes were used for the Alexa 488 atoms. The 
charges were calculated using antechamber. Simulations were 
performed as described above. Briefly, following minimization 
and a 200 ps equilibration, 20 ns simulations in water and in 
methanol were performed. The rotational autocorrelation times 
were calculated using a vector defined in the long dimension 
along the plane of the rings. 26 

The F ab , F c and the full-length antibody were fully solvated 
with SPC 69 water molecules. Approximately 25,500 water mole- 
cules were used to solvate the F , , 33,700 for F and 135,000 water 

ab c 

molecules were used to solvate the full-length antibody. Chloride 
or sodium atoms were added to neutralize the overall charge of 
the system. Octahedral periodic boundary conditions were used 
in each of the simulations. The electrostatic interactions were 
calculated using PME 70 with real space electrostatic cut off of 
1.0 nm. The Lennard-Jones potential, describing the van der 
Walls interaction, was cut off at 1.0 nm. The Settle algorithm 71 
was used to constrain the bond lengths and angles of the water 
molecules; Lines was used to constrain all other bond lengths, 72 
and the site algorithm, in Gromacs 4.0, was used to remove alkyl 



and amide hydrogen motions, allowing for the use of a time-step 
of4fs. 

Throughout these simulations, the temperature was kept 
constant by coupling the system to a temperature bath of 300 K 
using the V-rescale algorithm. 73 During equilibration, the pres- 
sure was kept constant by coupling the system to a pressure bath 
at 1.0 atm. 74 Following equilibration, the simulations were kept 
at constant volume. 

In all cases, following energy minimization, a 200 ps equili- 
bration, at constant pressure, was performed to allow for the den- 
sity of the system to converge. Following equilibration, a total of 
24 trajectories were initiated and analyzed. Of the 24 trajectories, 
19 were for the full-length antibody, four were for the F ab frag- 
ment and one was for the F c fragment. 

The full-length antibody trajectories were further broken 
down into two trajectories starting from a homology model 
based on 1HZH, two starting from homology models based on 
1IGY, eight trajectories (4 1HZH based and 4 1IGY based) using 
structures taken from the ANM calculations (i.e., configurations 
along the modes calculated using ANM) and six trajectories in 
150 mM sodium chloride. 

The initial starting structures for the 150 mM salt simulation 
were selected from the collection of 1HZH and 1IGY trajectories. 
Three of the initial structures for the 150 mM salt calculations 
had no domain-domain interaction as defined as not having a 
salt bridge interaction of less than 5 A. The other three initial 
structures had either a F , 1-F ,2 interaction, F , 1/F or F , 2/F 

ab ab ab c ab c 

interaction present as defined by at least one interdomain salt 
bridge having a distance of less than 5 A. 

All of full-length trastuzumab trajectories except for one 
1HZH based trajectory, were simulated for approximately 80 ns. 
One of the 1HZH based trajectories was simulated for 500 ns. 
The 150 mM salt trajectories were simulated for 167 ns each. 
This resulted in a total simulation time of over 2 (xs for the full- 
length antibody. The two dye conjugated F ab fragment simu- 
lations and one F simulation were each simulated for 400 ns. 

c 

In two of the simulations, the F ab was conjugated with N-(l- 
pyrene) maleimide at LC and 205 and HC residue 121. The F c 
was conjugated with N-(l-pyrene) maleimide at both HC 403 
residues. Two additional F ab simulations without conjugated dyes 
were simulated for 200 ns and used to compare with the conju- 
gated simulation. The complete list of the trajectories is given in 
Table SI. 

The fluorescence anisotropy of the N-(l-pyrene) maleimide, 
attached to the F ab at LC 205 and HC residue 121 as well as 
to the F c at HC 403, was calculated using methods described 
in Schroder et al., 2005. 26 Briefly, the fluorescence anisotropy is 
given by 

r{T) = \{P 2 [fi(t)*H{t+rj\) 

5 (1) 

where |x is a vector representing both the normalized absorption 
and emission dipole moment vectors of the molecule. For the 
purposes of the calculation, the absorption and emission dipole 
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moment vectors are estimated to be equivalent. For the case of 
N-(l-pyrene)-maleimide, the dipole moment vector used to cal- 
culate the fluorescence anisotropy is shown in Figure S3. All the 
anisotropy calculations were performed using the grotacf utility 
in Gromacs 4.0. 

Solvent accessible surface area (SASA). SASA was calculated 
using the g_sas code from gromacs. 

Mutual information. Inter- atomic mutual information (MI) 
for a trajectory from molecular dynamics simulation is defined as 



Aff^=JJp(x i ,x y )log 



r t )\ 



P(x,.,Xj) 



dx.dx , 



(2) 



where 1 ' ' w ' is the displacement of the atomic position 
for the i-th atom relative to its average position over an ensemble; 
p (x.) is the probability density of finding the i-th atom with a dis- 
placement of x.; and p(x^x) is the probability density of finding 
the i-th atom with a displacement of x. and the jth atom with a 
displacement of x.. In our calculations we divided the displace- 
ments into discrete bins {Ar.} and applied a discrete form of the 
Equation 1: 



MI ij= X P(Ar-Ar y )log 



pjAr.Ar) 
p(Ar r )p)Ar j ) 



(3) 



The displacement bins for the atomic positions of the i-th 
atom are as follows: p(Ar) is the probability of finding the i-th 
atom with a displacement in the i-th bin Ar.; p(Ar.,Ar) is the 
probability of finding the i-th atom with a displacement in Ar. 
and the j-th atom with a displacement in Ar.. The summation 
is over all possible displacement bin combinations {Ar.,Ar.}. For 
each dimension, we tested the effects of number of bins and 
found that the patterns of correlations from MI do not change 
with bin numbers greater than 100. Hence, we chose 100 bins 
for each dimension. 

Phi-psi entropy. For each non-terminal residue, the values for 
phi-psi angles were collected over all frames of the entire trajec- 
tory. For this phi-psi distribution, the phi-psi (cp-v|i) entropy is 
defined by the standard Shannon entropy: 



(4) 



Where p(cp,v|i) is the probability of the phi-psi angles falling in 
the { cp,i|i} bin. We used 3° interval to define bins in both phi and 
psi dimensions. 

Inter- domain salt bridges. Trajectories with only the charged 
residues were extracted from full trajectories. The extracted trajec- 
tories were then imported into VMD, and the salt bridge analysis 
tools in VMD were used to dump the time courses for the salt 
bridges. We applied two cutoffs, 3.2 A (VMD's definition) and 5 



A to define the salt bridges. We then sorted the inter-domain salt 
bridges and calculated the various averages. 

MSM construction. Choosing a lag time. Accurate construction 
of an MSM depends on choosing a set of conformational states 
in which the dynamics observed in simulations is sufficiently 
Markovian: i.e., the transition rate 37 k from state j to k (in some 
lag time t) is independent of preceding transitions from states i to 
j (37 k = 37.37 k ). Thus, the lag time t, used to estimate transition rate 
is a key parameter. For example, a very short lag time may not suf- 
ficiently be able to account for the time needed for equilibration 
within each metastable state. As others have done previously, 54 ' 55 
we determined that t = 1 ns is a suitable lag time by constructing 
a series of MSMs using a different lag times (1, 2 and 5 ns) and 
demonstrating that the rate of the slowest relaxation mode of the 
MSM is relatively insensitive to the choice of lag time (data not 
shown) . 

Solution of the master equation. The master equation describ- 
ing the MSM dynamics is dpldt = pK, where p is the state vec- 
tor of populations, and K is an 8 x 8 matrix of rate coefficients. 
Equivalently, this can be expressed in terms of the transition 
matrix T = exp(TK), using the discrete propagation operation 
p(t + 7) = p(t)T, where 37 is the probability of transitioning from 
state / to state j in time t. The solution of the master equation is 
p(t) = p(t = 0) B" 1 exp(-At), where A is a matrix with the eigen- 
values of K, A., as diagonal entries, and B is the matrix of the 
(left) eigenvectors of K as columns. The properties of K are such 
that the largest eigenvalue is zero (corresponding to the stationary 
state, for which dpldt = pK = 0), while the remaining eigenvalues 
X. are negative. The master equation kinetics can thus be described 
as a superposition of exponential relaxations for implied times- 
cales t* = -X." 1 . The sign structure (positive or negative entries) in 
each eigenvector describes the population flux occurring at each 
corresponding timescale. The second eigenvector (i.e., the largest 
non-zero A.) is the slowest kinetic relaxation in the system, and its 
corresponding eigenvector represents states exchanging popula- 
tion on this timescale. In practice, we calculate the implied times- 
cales from the transition matrix T, using the relation 75 



T* = -T/log(|X.) 



(5) 



where |x. are the eigenvalues of T. 

Estimating the transition rate matrix. The entries of the rate 
matrix T are estimated from counts of transitions observed in the 
simulation. Finite sampling error in estimating transition rates 
is propagated as estimation error in the implied timescales, so it 
is this error analysis we are most interested in. Because the rate 
matrix must obey detailed balance (i.e., p. 37 = p. 37, where p. are 
the equilibrium populations), inferring the transition matrix T 
from observed transition counts is a non-trivial problem, and sev- 
eral different methods exist, each with the purpose of inferring 
the posterior distribution of possible T given a finite collection 
of observed data. 76 ' 77 Here, we deal with this estimation prob- 
lem in two ways. First, to ensure that our estimated rate matrix 
obeys detailed balance, we count all transitions in both the 
forward and backward direction (this essentially assumes the 
simulation data are fully equilibrated, as molecular dynamics is 
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microscopically reversible). Second, we estimate the posterior dis- 
tribution of T nonparametrically, using the bootstrap method. 53 
Each bootstrap consists of drawing n trajectories, with replace- 
ment, from our finite set of n = 12 trajectories, and compiling the 
observed counts. Because trastuzumab is structurally symmetric 
(i.e., F ab l = F ab 2), F ab l-F c and F ab 2-F c transitions were counted 
as the same. This bootstrap process was repeated 10,000 times 
to construct a posterior distribution of transition count matrices. 

While the above bootstrap procedure addressed finite sam- 
pling error across trajectories, there is also sampling error due 
to the finite number of transitions observed. In particular, there 
are transitions (for example, to the three-contact state) that are 
not observed in the simulation data. To estimate entries of T for 
which no transitions are observed, we followed a pseudocount 
procedure in which, at each bootstrap iteration, instead of com- 
piling the raw transition counts » from each bootstrapped tra- 
jectory, the counts in each column^ were drawn from a Dirichlet 
distribution 



Dirichlet (p | u) 



Z(u) 



K 

n 

i=l 



P 



Ui — 1 



r(E* i «0 



Z(u) = 



(6) 



(7) 



with parameters u. = » + I IN, where n = 8 is the number of 
states in the MSM. This expression is derived from the Bayesian 
posterior of a multinomial process given the observed transition 
counts, assuming a uniform prior. 77 

Experimental. Protein expression and purification. 
Trastuzumab derivatives with engineered cysteine residues 
(THIOMABS) expressed and purified from Chinese Hamster 
Ovary (CHO) cells have been described previously. 30 For E. coli 
expression of trastuzumab and trastuzumab "hingeless" vari- 
ants, heavy and light chains were cloned into an expression vec- 
tor and expressed as described. 78 ' 79 Proteins were purified by first 
capturing the antibody on protein A column (GE Healthcare, 
MAbSelect SuRe) and subsequently purifying by ion-exchange 
chromatography (GE Healthcare, HiTrap SP FF). 

Site-specific conjugation of THIOMABs. Conjugation of the 
THIOMABs at engineered cysteines has been described previ- 
ously. 30 Briefly, THIOMABs were reduced with 20-fold molar 
excess of DTT at pH 7.0, 25°C over 14-16 h. Reducing agent was 
removed by desalting (GE Healthcare, PD-10). Native disulfide 
bridges were reformed by mild re-oxidation using 10-fold molar 
excess dhAA (Sigma Aldrich) over 3 h at 25°C. Conjugation 
reaction was initiated by addition of 3 -fold molar excess (com- 
plete labeling) of N-(1-Pyrene)maleimide (Invitrogen). Partial 
labeling was achieved using equal molar amount of N-(l- 
Pyrene)maleimide (Invitrogen). Conjugation was performed at 
25°C for 30 min. Any unreacted cysteine residues were alkyl- 
ated using 10-fold molar excess of sodium iodoacetate (Sigma 
Aldrich). Excess dye was removed by desalting (GE Healthcare, 



PD-10). Conjugation efficiency was determined by measuring 
protein and dye absorbance (at 280 nm and 342 nm, respec- 
tively). Samples were concentrated and stored in 50 mM sodium 
acetate buffer pH 5.5 at 5°C. Conjugation specificity was con- 
firmed by mass spectrometry. 

Size-exclusion chromatography. Unconjugated trastuzumab 
variants were analyzed using TSKgel G3000SW XL column 
(10 |xm, 7.8 mm x 30 cm, TOSOH Biosciences) on Agilent 1100 
HPLC system (operated using Chemstation software package). 
Mobile phase used: 0.25 M potassium phosphate buffer pH 6.9, 
0.5 M potassium chloride. The flow rate was 0.5ml/min. These 
conditions were used for analysis of E. ^//-derived hinged and 
"hingeless" THIOMAB LC205 and for monitoring the forma- 
tion of intermolecular disulfide bonds during the re-oxidation 
step of THIOMAB conjugation. 

Pyrene conjugation causes significant interaction of the anti- 
body with the above gel filtration resin. For analysis of conju- 
gated molecules, we chose to use non-silica resin: and Superdex 
200 10/30 GL (GE Healthcare) with PBS as a mobile phase at 
flow rate 0.5 ml/min. 

Hydrophobic-interaction chromatography (HIC). Uncon- 
jugated THIOMAB and THIOMAB conjugates with single or 
two pyrene probes were separated on TSKgel Phenyl NPR-5PW 
column (Tosoh Biosciences, 10 (xm, 7.5 mm x 7.5 cm) using 
linear gradient from 0-35% B over 200 min (buffer A: 1M 
ammonium sulfate, 50 mM sodium phosphate pH 7.0, buffer B: 
50 mM sodium phosphate, 25% isopropanol) at flow rate 
0.8 ml/min. Up to 20 mg of protein sample was loaded and 
purified on AKTA FPLC system (GE Healthcare). 

Fluorescence anisotropy measurements. Sample preparation. 
Pyrene-labeled trastuzumab THIOMABs were diluted from the 
stock solution (100 |xM) into appropriate buffer to final 1-3 (JiM 
in 1 ml quartz cuvette (Starna Cells) for fluorescence measure- 
ments. Trastuzumab molecules with reduced disulfide bonds 
were prepared by treating the antibody solution with 20 x molar 
excess DTT for an appropriate amount of time (20 min or 12 h) 
at pH 7.0 at room temperature and alkylating the exposed disul- 
fides with 100 x molar excess of sodium iodoacetate. Samples 
were desalted (GE Healthcare, PD-10) and stored in 50 mM 
sodium acetate buffer pH 5.5 at 5°C. 

Data acquisition. Time-resolved anisotropy was measured 
using time-correlated single-phonon counting technique 
(TCSPC) using FluoroMax4 (HORIBA Jobin Yvon) steady-state 
spectrofluorometer equipped with FluoroLog TCSPC lifetime 
module. Data was collected using Data Station software sup- 
plied by the manufacturer. A 340 nm NanoLED was used as 
an excitation source (with optical pulse duration of 1.4 ns), and 
decays were collected at 415 nm at 10 nm slit width. Lifetime 
decays were collected with excitation and detection polarizers at 
magic angle conditions. For anisotropy decays, two time decays 
were measured, parallel (/ ) and perpendicular (I VH ) (subscripts 
denote orientations of excitation and emission polarizers, respec- 
tively; V, vertical; H, horizontal). The correction factor (G factor) 
was determined by measuring / and I HH . Instrument response 
function (IRF) was recorded using a dilute solution of colloidal 
silica (Sigma Aldrich). Decay data was analyzed using DAS6 
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software package (HORIBA Jobin Yvon). Briefly, the anisotropy 
decay law r(t): 



tit): 



- GI, 



D(t) 



S(t) Iyy+lGIyjj 



(8) 



was determined using impulse reconvolution analysis by first fit- 
ting the denominator S{t) (total fluorescence decay) to the multi- 
exponential decay using the method of nonlinear least squares. 80 
Using the obtained total fluorescence decay, the product r(t)S(t) 
was convoluted with the IRF to find the best fit to the experimen- 
tal difference decay D(i). A single- or two-exponential anisot- 
ropy decay was used for impulse reconvolution analysis: 



K0 = >taf +3 exp 



i 



+ B 2 exp 



I 



T r2. 



(9) 



Average anisotropy decay time 



£,r H +B 2 r r2 



B. +B 2 



(10) 



parameters of fragments and full-length antibody and showed 
that the same parameters estimated from all-atom molecular 
dynamics simulations were quantitatively similar, thus validating 
the computational models. We further analyzed the atomic details 
of our simulation data by looking at correlated motion between 
residue pairs and phi/psi conformational entropy. Dynamics in 
our models are consistent with previous data such as high struc- 
tural flexibility in the hinge region and heavy chain CDR-3. 
Correlated motion suggests that Ig domains are highly correlated 
along the direction of their (3 strands (e.g., heavy chain variable 
domain to light chain variable domain) and less correlated in the 
"end-to-end" direction (e.g., variable domain to CHI). This result 
indicates that domains that are functionally coupled (e.g., vari- 
able domains that bind to a single antigen) have coupled dynam- 
ics, and the dual functions of the antibody (antigen binding and 
effector binding) are dynamically independent. Importantly, we 
show significant and non-specific inter-fragment domain— domain 
interactions. We incorporate the molecular dynamics data (> 1 
|xs, an order of magnitude more than what was previously pub- 
lished) into a Markov Model and show an equilibrium between 
multiple meta-stable conformations of full-length trastuzumab 
with domain-domain interactions between fragments. The bio- 
logical significance of these domain-domain interactions remains 
to be determined. 



was used as a measure of overall flexibility of the full-length anti- 
body or the antibody fragment with a fluorescence probe at a 
given conjugation position. 81 

Conclusions 

We used experimental and computational methods to characterize 
the solution dynamics of the IgGl antibody trastuzumab and find 
that the full-length antibody exists as a heterogeneous population 
of meta-stable conformations with non-specific domain-domain 
interactions between fragments. We measured the hydrodynamic 
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